ST(1)                    XENIX System V                    ST(1)
NAME
st - Sound Tools - sound sample file and effects libraries.
SYNOPSIS
cc file.c -o file libst.a
DESCRIPTION
Sound Tools is a library of sound sample file format
readers/writers and sound effects processors.
Sound Tools includes skeleton C files to assist you in
writing new formats and effects. The full skeleton driver,
skel.c, helps you write a driver for a new format that has
its own data structures. The simple skeleton drivers help you write
a new driver for raw (headerless) formats, or for formats
which just have a simple header followed by raw data.
Most sound sample formats are fairly simple: they are just a
string of bytes or words and are presumed to be sampled at a
known data rate. Most of them have a short data structure
at the beginning of the file.
INTERNALS
The Sound Tools formats and effects operate on an internal
buffer format of signed 32-bit longs. The data processing
routines are called with buffers of these samples, and
buffer sizes which refer to the number of samples processed,
not the number of bytes. File readers translate the input
samples to signed longs and return the number of longs read.
For example, data in linear signed byte format is left-
shifted 24 bits.
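The widening step can be sketched as below. This is only an illustration: byte_to_internal is a hypothetical name, and int32_t stands in for the 32-bit long the text assumes. It multiplies by 2^24 rather than shifting, because a left shift of a negative value is undefined in standard C.

```c
#include <stdint.h>

/* Hypothetical helper: widen one linear signed 8-bit sample to the
 * 32-bit internal format. Multiplying by 2^24 gives the same result
 * as a 24-bit left shift, but stays defined for negative input. */
int32_t byte_to_internal(signed char s)
{
    return (int32_t)s * 16777216;   /* 16777216 == 2^24 */
}
```

The extreme inputs -128 and 127 map to the most negative 32-bit value and to 127 * 2^24, so the full byte range fits.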
This does cause problems in processing the data. For
example:
*obuf++ = (*ibuf++ + *ibuf++)/2;
would not mix down left and right channels into one
monophonic channel, because the sum of two samples can
overflow 32 bits. Instead, the ``avg'' effect must use:
*obuf++ = *ibuf++/2 + *ibuf++/2;
Stereo data is stored with the left and right speaker data
in successive samples. Quadraphonic data is stored in this
order: left front, right front, left rear, right rear.
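A stereo mix-down over interleaved samples might look like the sketch below. mix_to_mono and its parameters are hypothetical names, and int32_t stands in for the 32-bit long. The one-line forms in the text advance ibuf twice within a single expression, which standard C leaves unsequenced, so this version indexes the buffer instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: average interleaved stereo (L, R, L, R, ...)
 * into mono. nframes counts stereo pairs, not individual samples.
 * Each channel is halved before the add so the sum cannot overflow. */
void mix_to_mono(const int32_t *ibuf, int32_t *obuf, size_t nframes)
{
    size_t i;

    for (i = 0; i < nframes; i++) {
        int32_t left = ibuf[2 * i];        /* even slots: left speaker */
        int32_t right = ibuf[2 * i + 1];   /* odd slots: right speaker */
        obuf[i] = left / 2 + right / 2;
    }
}
```

Halving first discards the lowest bit of each channel, which is the price of staying within 32 bits.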
FORMATS
A format is responsible for translating between sound sample
files and an internal buffer. The internal buffer is stored
in signed longs with a fixed sampling rate. The format
operates from two data structures: a format structure, and a
private structure.
The format structure contains a list of control parameters
for the sample: sampling rate, data size (bytes, words,
floats, etc.), style (unsigned, signed, logarithmic), and
number of sound channels. It also contains other state
information: whether the sample file needs to be
byte-swapped, whether fseek() will work, its suffix, its file
stream pointer, its format pointer, and the private
structure for the format.
The private area is just a preallocated data array for the
format to use however it wishes. The format should define a
data structure and cast the array to that structure. See
voc.c for the use of a private data area: voc.c has to track
the number of samples it writes and, when finishing, seek
back to the beginning of the file and write that count out.
The private area is not very large. The ``echo'' effect has
to malloc() a much larger area for its delay line buffers.
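The cast described above might be sketched as follows. Every name here (struct stream, PRIV_SIZE, struct count_priv, the count_* routines) is hypothetical, and the size of 64 bytes is an assumption, since the text does not give the real layout or limit. The union only forces an alignment suitable for long and double. The sketch relies on the same cast the text describes, so code built with aggressive aliasing optimizations may prefer memcpy().

```c
#define PRIV_SIZE 64   /* assumed size; the real limit is not stated */

/* Hypothetical stand-in for the format structure's private area. */
struct stream {
    union {
        char bytes[PRIV_SIZE];
        long align_long;       /* alignment only, never read */
        double align_double;   /* alignment only, never read */
    } priv;
};

/* Hypothetical private state for a writer that counts its samples. */
struct count_priv {
    long samples_written;
};

/* Fails to compile if the private struct outgrows the area. */
typedef char priv_fits[sizeof(struct count_priv) <= PRIV_SIZE ? 1 : -1];

/* Reset the counter when writing starts. */
void count_start(struct stream *ft)
{
    struct count_priv *p = (struct count_priv *)ft->priv.bytes;
    p->samples_written = 0;
}

/* Record n more samples written. */
void count_add(struct stream *ft, long n)
{
    struct count_priv *p = (struct count_priv *)ft->priv.bytes;
    p->samples_written += n;
}

/* Total to patch into the header when writing stops. */
long count_total(struct stream *ft)
{
    struct count_priv *p = (struct count_priv *)ft->priv.bytes;
    return p->samples_written;
}
```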
A format has 6 routines:
startread   Set up the format parameters, or read in a data
            header, or do what needs to be done.
read        Given a buffer and a length: read up to that many
            samples, transform them into signed long integers,
            and copy them into the buffer. Return the number of
            samples actually read.
stopread    Do what needs to be done.
startwrite  Set up the format parameters, or write out a data
            header, or do what needs to be done.
write       Given a buffer and a length: copy that many samples
            out of the buffer, convert them from signed longs
            to the appropriate data, and write them to the
            file. If it can't write out all the samples, fail.
stopwrite   Fix up any file header, or do what needs to be done.
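As an illustration of the read contract, here is a minimal sketch for headerless signed 8-bit data. raw_sb_read and its signature are hypothetical, not the library's actual interface, and int32_t stands in for the 32-bit long.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical "read" routine for headerless signed 8-bit data.
 * Reads up to len samples from fp, widens each to the 32-bit
 * internal format, and returns the number of samples stored. */
long raw_sb_read(FILE *fp, int32_t *buf, long len)
{
    long done = 0;
    int c;

    while (done < len && (c = getc(fp)) != EOF) {
        /* getc() yields 0..255; reinterpret as signed -128..127. */
        int s = (c >= 128) ? c - 256 : c;
        buf[done++] = (int32_t)s * 16777216;   /* same as s * 2^24 */
    }
    return done;   /* may be less than len at end of file */
}
```

A caller would loop until the routine returns fewer samples than it asked for.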
EFFECTS
An effects loop has one input and one output stream. It has
5 routines:
getopts     is called with a character string argument list for
            the effect.
start       is called with the signal parameters for the input
            and output streams.
flow        is called with input and output data buffers, and
            (by reference) the input and output data sizes. It
            processes the input buffer into the output buffer,
            and sets the size variables to the numbers of
            samples actually processed. It is under no
            obligation to fill the output buffer.
drain       is called after there are no more input data
            samples. If the effect wishes to generate more data
            samples it copies the generated data into a given
            buffer and returns the number of samples generated.
            If it fills the buffer, it will be called again,
            etc. The echo effect uses this to fade away.
stop        is called when there are no more input samples to
            process. stop may generate output samples on its
            own. See echo.c for how to do this, and see that
            what it does is absolutely bogus.
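The by-reference size convention of flow can be sketched with a trivial effect that halves the volume. half_flow and its parameter names are hypothetical, and the real routine's argument list may differ.

```c
#include <stdint.h>

/* Hypothetical "flow" routine for an effect that halves the volume.
 * On entry *isamp and *osamp hold the buffer capacities in samples;
 * on return they hold how many samples were consumed and produced.
 * The routine may stop short of filling obuf. */
void half_flow(const int32_t *ibuf, int32_t *obuf, long *isamp, long *osamp)
{
    long n = (*isamp < *osamp) ? *isamp : *osamp;
    long i;

    for (i = 0; i < n; i++)
        obuf[i] = ibuf[i] / 2;

    *isamp = n;   /* samples consumed from ibuf */
    *osamp = n;   /* samples produced into obuf */
}
```

Reporting both counts lets the caller resubmit any input the effect did not consume.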
COMMENTS
Theoretically, formats can be used to manipulate several
files inside one program. Multi-sample files, for example
the download for a sampling keyboard, can be handled cleanly
with this feature.
PORTABILITY PROBLEMS
Many computers don't supply arithmetic shifting, so do
multiplies and divides instead of << and >>. The compiler
will do the right thing if the CPU supplies arithmetic
shifting.
Do all arithmetic conversions one stage at a time. I've had
too many problems with "obviously clean" combinations.
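Both pieces of advice can be illustrated with a sketch that narrows an internal sample to an unsigned byte. internal_to_ubyte is a hypothetical name and int32_t stands in for the 32-bit long. Division truncates toward zero in C99, whereas an arithmetic right shift rounds toward negative infinity, so the two can differ by one step for negative samples.

```c
#include <stdint.h>

/* Hypothetical helper: narrow an internal 32-bit sample to an
 * unsigned 8-bit sample, one stage at a time. */
unsigned char internal_to_ubyte(int32_t sample)
{
    int32_t top;      /* signed value in -128..127 */
    int32_t biased;   /* offset value in 0..255 */

    top = sample / 16777216;   /* divide by 2^24 instead of >> 24 */
    biased = top + 128;        /* signed to unsigned bias */
    return (unsigned char)biased;
}
```

Keeping each stage in its own named variable makes every intermediate range easy to check by hand.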
In general, don't worry about "efficiency". The sox.c base
translator is disk-bound on any machine (other than an 8088
PC with an SMD disk controller). Just comment your code and
make sure it's clean and simple. You'll find that DSP code
is extremely painful to write as it is.
BUGS
The HCOM format is not re-entrant; it can only be used once
in a program.
The program/library interface is pretty weak. There's too
much ad-hoc information which a program is supposed to
gather up. Sound Tools wants to be an object-oriented
dataflow architecture.
The human ear can't really hear better than 20 bits. With
an internal format of 16 bits, mixing CDs would eventually
destroy information. The internal format should be 24-bit
signed data. But with 24 bits you still have to be careful
when multiplying. Check the ``vibro'' effect for how it
handles this problem.
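One general way to multiply safely is to form the product in a wider type, as sketched below. This is not taken from the ``vibro'' source, whose approach is not described here. scale_sample is a hypothetical name, and the 64-bit int64_t type is a C99 assumption that compilers of this page's era may not offer.

```c
#include <stdint.h>

/* Hypothetical helper: scale a 32-bit sample by the fraction
 * num/den, where den > 0 and 0 <= num <= den. The product is formed
 * in 64 bits so it cannot overflow, then narrowed; the result never
 * exceeds the input in magnitude, so it fits in 32 bits. */
int32_t scale_sample(int32_t sample, int32_t num, int32_t den)
{
    int64_t wide = (int64_t)sample * num;

    return (int32_t)(wide / den);
}
```

Without a wider type, the usual fallback is to divide before multiplying and accept the loss of low-order bits.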
Page 4 (printed 7/27/94)